{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "9e1ac4fceb263946",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "<a href=\"https://colab.research.google.com/github/milvus-io/bootcamp/blob/master/bootcamp/tutorials/integration/build_RAG_with_milvus_and_crawl4ai.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>   <a href=\"https://github.com/milvus-io/bootcamp/blob/master/bootcamp/tutorials/integration/build_RAG_with_milvus_and_crawl4ai.ipynb\" target=\"_blank\">\n",
    "    <img src=\"https://img.shields.io/badge/View%20on%20GitHub-555555?style=flat&logo=github&logoColor=white\" alt=\"GitHub Repository\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c1e6d7d051e1b5d9",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "# Building RAG with Milvus and Crawl4AI\n",
    "\n",
    "[Crawl4AI](https://crawl4ai.com/mkdocs/) delivers blazing-fast, AI-ready web crawling for LLMs. Open-source and optimized for RAG, it simplifies scraping with advanced extraction and real-time performance.\n",
    "\n",
    "In this tutorial, we’ll show you how to build a Retrieval-Augmented Generation (RAG) pipeline using Milvus and Crawl4AI. The pipeline integrates Crawl4AI for web data crawling, Milvus for vector storage, and OpenAI for generating insightful, context-aware responses.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d7f4056ca7aca338",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Preparation"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c9d871e3c89b776",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "### Dependencies and Environment\n",
    "\n",
    "To start, install the required dependencies by running the following command:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "initial_id",
   "metadata": {},
   "outputs": [],
   "source": [
    "! pip install -U crawl4ai pymilvus openai requests tqdm"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6281694e62787a4b",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "> If you are using Google Colab, to enable dependencies just installed, you may need to **restart the runtime** (click on the \"Runtime\" menu at the top of the screen, and select \"Restart session\" from the dropdown menu)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5f5a85b9a62416a3",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "To fully set up crawl4ai, run the following commands:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "7eeefc4b638fbd22",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:25:09.133696Z",
     "start_time": "2025-01-14T09:25:02.169833Z"
    },
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[36m[INIT].... → Running post-installation setup...\u001b[0m\r\n",
      "\u001b[36m[INIT].... → Installing Playwright browsers...\u001b[0m\r\n",
      "\u001b[32m[COMPLETE] ● Playwright installation completed successfully.\u001b[0m\r\n",
      "\u001b[36m[INIT].... → Starting database initialization...\u001b[0m\r\n",
      "\u001b[32m[COMPLETE] ● Database initialization completed successfully.\u001b[0m\r\n",
      "\u001b[32m[COMPLETE] ● Post-installation setup completed!\u001b[0m\r\n",
      "\u001b[0m\u001b[36m[INIT].... → Running Crawl4AI health check...\u001b[0m\r\n",
      "\u001b[36m[INIT].... → Crawl4AI 0.4.247\u001b[0m\r\n",
      "\u001b[36m[TEST].... ℹ Testing crawling capabilities...\u001b[0m\r\n",
      "\u001b[36m[EXPORT].. ℹ Exporting PDF and taking screenshot took 0.80s\u001b[0m\r\n",
      "\u001b[32m[FETCH]... ↓ https://crawl4ai.com... | Status: \u001b[32mTrue\u001b[0m | Time: 4.22s\u001b[0m\r\n",
      "\u001b[36m[SCRAPE].. ◆ Processed https://crawl4ai.com... | Time: 14ms\u001b[0m\r\n",
      "\u001b[32m[COMPLETE] ● https://crawl4ai.com... | Status: \u001b[32mTrue\u001b[0m | Total: \u001b[33m4.23s\u001b[0m\u001b[0m\r\n",
      "\u001b[32m[COMPLETE] ● ✅ Crawling test passed!\u001b[0m\r\n",
      "\u001b[0m"
     ]
    }
   ],
   "source": [
    "# Run post-installation setup\n",
    "! crawl4ai-setup\n",
    "\n",
    "# Verify installation\n",
    "! crawl4ai-doctor"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "75596dd3e4d1d8de",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "### Setting Up OpenAI API Key\n",
    "\n",
    "We will use OpenAI as the LLM in this example. You should prepare the [OPENAI_API_KEY](https://platform.openai.com/docs/quickstart) as an environment variable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "e14a99a051098a87",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:25:09.139980Z",
     "start_time": "2025-01-14T09:25:09.134353Z"
    }
   },
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = \"sk-***********\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "baf040ed69755a72",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "### Prepare the LLM and Embedding Model\n",
    "\n",
    "We initialize the OpenAI client to prepare the embedding model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "6ac107df-9e41-4afe-be70-1ff2cfd8a727",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:25:11.087058Z",
     "start_time": "2025-01-14T09:25:10.712174Z"
    }
   },
   "outputs": [],
   "source": [
    "from openai import OpenAI\n",
    "\n",
    "openai_client = OpenAI()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5fcd7358761d1e78",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "Define a function to generate text embeddings using OpenAI client. We use the [text-embedding-3-small](https://platform.openai.com/docs/guides/embeddings) model as an example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "d90e40d0-44a4-43a2-8567-8c8d758d1bd4",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:25:13.161855Z",
     "start_time": "2025-01-14T09:25:13.153595Z"
    }
   },
   "outputs": [],
   "source": [
    "def emb_text(text):\n",
    "    return (\n",
    "        openai_client.embeddings.create(input=text, model=\"text-embedding-3-small\")\n",
    "        .data[0]\n",
    "        .embedding\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3d4cc6ea-e4fe-4721-8c5a-e930c1c3ff9b",
   "metadata": {},
   "source": [
    "Generate a test embedding and print its dimension and first few elements."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "8be4219c-9ead-45bb-aea1-15dd1143b3f1",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:25:14.948153Z",
     "start_time": "2025-01-14T09:25:14.442989Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1536\n",
      "[0.009889289736747742, -0.005578675772994757, 0.00683477520942688, -0.03805781528353691, -0.01824733428657055, -0.04121600463986397, -0.007636285852640867, 0.03225184231996536, 0.018949154764413834, 9.352207416668534e-05]\n"
     ]
    }
   ],
   "source": [
    "test_embedding = emb_text(\"This is a test\")\n",
    "embedding_dim = len(test_embedding)\n",
    "print(embedding_dim)\n",
    "print(test_embedding[:10])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e4a0bb710a360517",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Crawl Data Using Crawl4AI"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "100fa4fd9c1e003",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:25:18.538915Z",
     "start_time": "2025-01-14T09:25:17.589353Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[INIT].... → Crawl4AI 0.4.247\n",
      "[FETCH]... ↓ https://lilianweng.github.io/posts/2023-06-23-agen... | Status: True | Time: 0.07s\n",
      "[COMPLETE] ● https://lilianweng.github.io/posts/2023-06-23-agen... | Status: True | Total: 0.08s\n"
     ]
    }
   ],
   "source": [
    "from crawl4ai import *\n",
    "\n",
    "\n",
    "async def crawl():\n",
    "    async with AsyncWebCrawler() as crawler:\n",
    "        result = await crawler.arun(\n",
    "            url=\"https://lilianweng.github.io/posts/2023-06-23-agent/\",\n",
    "        )\n",
    "        return result.markdown\n",
    "\n",
    "\n",
    "markdown_content = await crawl()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "229449707ff84d2d",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "### Process the Crawled Content\n",
    "\n",
    "To make the crawled content manageable for insertion into Milvus, we simply use \"# \" to separate the content, which can roughly separate the content of each main part of the crawled markdown file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "1f03911422e6a973",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:25:24.778624Z",
     "start_time": "2025-01-14T09:25:24.753005Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Section 1:\n",
      "[Lil'Log](https://lilianweng.github.io/posts/2023-06-23-agent/<https:/lilianweng.github.io/> \"Lil'Log \\(Alt + H\\)\")\n",
      "  * |\n",
      "\n",
      "\n",
      "  * [ Posts ](https://lilianweng.github.io/posts/2023-06-23-agent/<https:/lilianweng.github.io/> \"Posts\")\n",
      "  * [ Archive ](https://lilianweng.github.io/posts/2023-06-23-agent/<h...\n",
      "--------------------------------------------------\n",
      "Section 2:\n",
      "LLM Powered Autonomous Agents \n",
      "Date: June 23, 2023 | Estimated Reading Time: 31 min | Author: Lilian Weng \n",
      "Table of Contents\n",
      "  * [Agent System Overview](https://lilianweng.github.io/posts/2023-06-23-agent/<#agent-system-overview>)\n",
      "  * [Component One: Planning](https://lilianweng.github.io/posts/2023...\n",
      "--------------------------------------------------\n",
      "Section 3:\n",
      "Agent System Overview[#](https://lilianweng.github.io/posts/2023-06-23-agent/<#agent-system-overview>)\n",
      "In a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:\n",
      "  * **Planning**\n",
      "    * Subgoal and decomposition: The agent breaks down large t...\n",
      "--------------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "def split_markdown_content(content):\n",
    "    return [section.strip() for section in content.split(\"# \") if section.strip()]\n",
    "\n",
    "\n",
    "# Process the crawled markdown content\n",
    "sections = split_markdown_content(markdown_content)\n",
    "\n",
    "# Print the first few sections to understand the structure\n",
    "for i, section in enumerate(sections[:3]):\n",
    "    print(f\"Section {i+1}:\")\n",
    "    print(section[:300] + \"...\")\n",
    "    print(\"-\" * 50)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1353faca577a4d7",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Load Data into Milvus"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "08d46563-ac2f-4951-a1fc-920577750a44",
   "metadata": {},
   "source": [
    "### Create the collection"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "1ca09192-a150-46eb-86c5-de9f10546a41",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:25:29.325267Z",
     "start_time": "2025-01-14T09:25:27.696938Z"
    },
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:numexpr.utils:Note: NumExpr detected 10 cores but \"NUMEXPR_MAX_THREADS\" not set, so enforcing safe limit of 8.\n",
      "INFO:numexpr.utils:NumExpr defaulting to 8 threads.\n"
     ]
    }
   ],
   "source": [
    "from pymilvus import MilvusClient\n",
    "\n",
    "milvus_client = MilvusClient(uri=\"./milvus_demo.db\")\n",
    "collection_name = \"my_rag_collection\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7d4e1e20-4fb3-4a20-b8fd-5e20497b0c62",
   "metadata": {},
   "source": [
    "> As for the argument of `MilvusClient`:\n",
    "> - Setting the `uri` as a local file, e.g.`./milvus.db`, is the most convenient method, as it automatically utilizes [Milvus Lite](https://milvus.io/docs/milvus_lite.md) to store all data in this file.\n",
    "> - If you have large scale of data, you can set up a more performant Milvus server on [docker or kubernetes](https://milvus.io/docs/quickstart.md). In this setup, please use the server uri, e.g.`http://localhost:19530`, as your `uri`.\n",
    "> - If you want to use [Zilliz Cloud](https://zilliz.com/cloud), the fully managed cloud service for Milvus, adjust the `uri` and `token`, which correspond to the [Public Endpoint and Api key](https://docs.zilliz.com/docs/on-zilliz-cloud-console#free-cluster-details) in Zilliz Cloud."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fadf5572-f0ea-4799-9247-2e92e58bfe20",
   "metadata": {},
   "source": [
    "Check if the collection already exists and drop it if it does."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "d7e81fae-b312-408a-82f3-c301928bb9df",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:25:31.948948Z",
     "start_time": "2025-01-14T09:25:31.935055Z"
    }
   },
   "outputs": [],
   "source": [
    "if milvus_client.has_collection(collection_name):\n",
    "    milvus_client.drop_collection(collection_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "41e082b4-93e1-438a-800b-ec34174d43f8",
   "metadata": {},
   "source": [
    "Create a new collection with specified parameters.\n",
    "\n",
    "If we don’t specify any field information, Milvus will automatically create a default `id` field for primary key, and a `vector` field to store the vector data. A reserved JSON field is used to store non-schema-defined fields and their values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "97bd0406-ad92-4b18-8702-2dbb666e5f46",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:25:36.692203Z",
     "start_time": "2025-01-14T09:25:36.167918Z"
    }
   },
   "outputs": [],
   "source": [
    "milvus_client.create_collection(\n",
    "    collection_name=collection_name,\n",
    "    dimension=embedding_dim,\n",
    "    metric_type=\"IP\",  # Inner product distance\n",
    "    consistency_level=\"Strong\",  # Strong consistency level\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5a488258-64d5-4539-942f-dd50d90300e4",
   "metadata": {},
   "source": [
    "### Insert data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "a2ac980d-5969-4fe9-97b7-5a65c78c9224",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:25:48.142063Z",
     "start_time": "2025-01-14T09:25:38.181051Z"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Processing sections:   0%|          | 0/18 [00:00<?, ?it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:   6%|▌         | 1/18 [00:00<00:12,  1.37it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:  11%|█         | 2/18 [00:01<00:11,  1.39it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:  17%|█▋        | 3/18 [00:02<00:10,  1.40it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:  22%|██▏       | 4/18 [00:02<00:07,  1.85it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:  28%|██▊       | 5/18 [00:02<00:06,  2.06it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:  33%|███▎      | 6/18 [00:03<00:06,  1.94it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:  39%|███▉      | 7/18 [00:03<00:05,  2.14it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:  44%|████▍     | 8/18 [00:04<00:04,  2.29it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:  50%|█████     | 9/18 [00:04<00:04,  2.20it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:  56%|█████▌    | 10/18 [00:05<00:03,  2.09it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:  61%|██████    | 11/18 [00:06<00:04,  1.68it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:  67%|██████▋   | 12/18 [00:06<00:04,  1.48it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:  72%|███████▏  | 13/18 [00:07<00:02,  1.75it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:  78%|███████▊  | 14/18 [00:07<00:01,  2.02it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:  83%|████████▎ | 15/18 [00:07<00:01,  2.12it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:  89%|████████▉ | 16/18 [00:08<00:01,  1.61it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections:  94%|█████████▍| 17/18 [00:09<00:00,  1.92it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
      "Processing sections: 100%|██████████| 18/18 [00:09<00:00,  1.83it/s]\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{'insert_count': 18, 'ids': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], 'cost': 0}"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from tqdm import tqdm\n",
    "\n",
    "data = []\n",
    "for i, section in enumerate(tqdm(sections, desc=\"Processing sections\")):\n",
    "    embedding = emb_text(section)\n",
    "    data.append({\"id\": i, \"vector\": embedding, \"text\": section})\n",
    "\n",
    "# Insert data into Milvus\n",
    "milvus_client.insert(collection_name=collection_name, data=data)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "01b502fb-21b4-48dc-b624-93b510538fbe",
   "metadata": {},
   "source": [
    "## Build RAG"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7120f6b1-4f07-4b70-b1f1-dc47a9cd7524",
   "metadata": {},
   "source": [
    "### Retrieve data for a query\n",
    "\n",
    "Let’s specify a query question about the website we just crawled.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "33f1f275-9038-4836-9189-d55efed7fab7",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:25:56.603131Z",
     "start_time": "2025-01-14T09:25:56.595437Z"
    }
   },
   "outputs": [],
   "source": [
    "question = \"What are the main components of autonomous agents?\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0446d285-9e27-45be-a5f1-29bbf55fa2e9",
   "metadata": {},
   "source": [
    "Search for the question in the collection and retrieve the semantic top-3 matches."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "ad2e5d1e-6550-41d8-928b-a34562b86790",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:25:59.359036Z",
     "start_time": "2025-01-14T09:25:58.737765Z"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n"
     ]
    }
   ],
   "source": [
    "search_res = milvus_client.search(\n",
    "    collection_name=collection_name,\n",
    "    data=[emb_text(question)],\n",
    "    limit=3,\n",
    "    search_params={\"metric_type\": \"IP\", \"params\": {}},\n",
    "    output_fields=[\"text\"],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "113d8b42-0397-4015-b7aa-8861563bcf14",
   "metadata": {},
   "source": [
    "Let’s take a look at the search results of the query\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "ac85b108-8fbc-4eb9-82e1-6b9058c25388",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:26:02.278534Z",
     "start_time": "2025-01-14T09:26:02.270934Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[\n",
      "    [\n",
      "        \"Agent System Overview[#](https://lilianweng.github.io/posts/2023-06-23-agent/<#agent-system-overview>)\\nIn a LLM-powered autonomous agent system, LLM functions as the agent\\u2019s brain, complemented by several key components:\\n  * **Planning**\\n    * Subgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.\\n    * Reflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.\\n  * **Memory**\\n    * Short-term memory: I would consider all the in-context learning (See [Prompt Engineering](https://lilianweng.github.io/posts/2023-06-23-agent/<https:/lilianweng.github.io/posts/2023-03-15-prompt-engineering/>)) as utilizing short-term memory of the model to learn.\\n    * Long-term memory: This provides the agent with the capability to retain and recall (infinite) information over extended periods, often by leveraging an external vector store and fast retrieval.\\n  * **Tool use**\\n    * The agent learns to call external APIs for extra information that is missing from the model weights (often hard to change after pre-training), including current information, code execution capability, access to proprietary information sources and more.\\n\\n![](https://lilianweng.github.io/posts/2023-06-23-agent/agent-overview.png) Fig. 1. Overview of a LLM-powered autonomous agent system.\",\n",
      "        0.6433743238449097\n",
      "    ],\n",
      "    [\n",
      "        \"LLM Powered Autonomous Agents \\nDate: June 23, 2023 | Estimated Reading Time: 31 min | Author: Lilian Weng \\nTable of Contents\\n  * [Agent System Overview](https://lilianweng.github.io/posts/2023-06-23-agent/<#agent-system-overview>)\\n  * [Component One: Planning](https://lilianweng.github.io/posts/2023-06-23-agent/<#component-one-planning>)\\n    * [Task Decomposition](https://lilianweng.github.io/posts/2023-06-23-agent/<#task-decomposition>)\\n    * [Self-Reflection](https://lilianweng.github.io/posts/2023-06-23-agent/<#self-reflection>)\\n  * [Component Two: Memory](https://lilianweng.github.io/posts/2023-06-23-agent/<#component-two-memory>)\\n    * [Types of Memory](https://lilianweng.github.io/posts/2023-06-23-agent/<#types-of-memory>)\\n    * [Maximum Inner Product Search (MIPS)](https://lilianweng.github.io/posts/2023-06-23-agent/<#maximum-inner-product-search-mips>)\\n  * [Component Three: Tool Use](https://lilianweng.github.io/posts/2023-06-23-agent/<#component-three-tool-use>)\\n  * [Case Studies](https://lilianweng.github.io/posts/2023-06-23-agent/<#case-studies>)\\n    * [Scientific Discovery Agent](https://lilianweng.github.io/posts/2023-06-23-agent/<#scientific-discovery-agent>)\\n    * [Generative Agents Simulation](https://lilianweng.github.io/posts/2023-06-23-agent/<#generative-agents-simulation>)\\n    * [Proof-of-Concept Examples](https://lilianweng.github.io/posts/2023-06-23-agent/<#proof-of-concept-examples>)\\n  * [Challenges](https://lilianweng.github.io/posts/2023-06-23-agent/<#challenges>)\\n  * [Citation](https://lilianweng.github.io/posts/2023-06-23-agent/<#citation>)\\n  * [References](https://lilianweng.github.io/posts/2023-06-23-agent/<#references>)\\n\\n\\nBuilding agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as [AutoGPT](https://lilianweng.github.io/posts/2023-06-23-agent/<https:/github.com/Significant-Gravitas/Auto-GPT>), [GPT-Engineer](https://lilianweng.github.io/posts/2023-06-23-agent/<https:/github.com/AntonOsika/gpt-engineer>) and [BabyAGI](https://lilianweng.github.io/posts/2023-06-23-agent/<https:/github.com/yoheinakajima/babyagi>), serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\",\n",
      "        0.5462194085121155\n",
      "    ],\n",
      "    [\n",
      "        \"Component One: Planning[#](https://lilianweng.github.io/posts/2023-06-23-agent/<#component-one-planning>)\\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\\n#\",\n",
      "        0.5223420858383179\n",
      "    ]\n",
      "]\n"
     ]
    }
   ],
   "source": [
    "import json\n",
    "\n",
    "retrieved_lines_with_distances = [\n",
    "    (res[\"entity\"][\"text\"], res[\"distance\"]) for res in search_res[0]\n",
    "]\n",
    "print(json.dumps(retrieved_lines_with_distances, indent=4))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1fecbb753851d312",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "### Use LLM to get a RAG response\n",
    "\n",
    "Convert the retrieved documents into a string format.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "c7d406532a6c0d59",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:26:08.436450Z",
     "start_time": "2025-01-14T09:26:08.427915Z"
    },
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "context = \"\\n\".join(\n",
    "    [line_with_distance[0] for line_with_distance in retrieved_lines_with_distances]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "34aabd19ee91b7d",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "Define system and user prompts for the Lanage Model. This prompt is assembled with the retrieved documents from Milvus.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "42c4f2f501f80607",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:26:11.452524Z",
     "start_time": "2025-01-14T09:26:11.445572Z"
    },
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "SYSTEM_PROMPT = \"\"\"\n",
    "Human: You are an AI assistant. You are able to find answers to the questions from the contextual passage snippets provided.\n",
    "\"\"\"\n",
    "USER_PROMPT = f\"\"\"\n",
    "Use the following pieces of information enclosed in <context> tags to provide an answer to the question enclosed in <question> tags.\n",
    "<context>\n",
    "{context}\n",
    "</context>\n",
    "<question>\n",
    "{question}\n",
    "</question>\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5fe76342b1c2ea6c",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "Use OpenAI ChatGPT to generate a response based on the prompts.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "c340830791e55c03",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-01-14T09:26:26.095984Z",
     "start_time": "2025-01-14T09:26:23.196524Z"
    },
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The main components of autonomous agents are:\n",
      "\n",
      "1. **Planning**:\n",
      "   - Subgoal and decomposition: Breaking down large tasks into smaller, manageable subgoals.\n",
      "   - Reflection and refinement: Self-criticism and reflection to learn from past actions and improve future steps.\n",
      "\n",
      "2. **Memory**:\n",
      "   - Short-term memory: In-context learning using prompt engineering.\n",
      "   - Long-term memory: Retaining and recalling information over extended periods using an external vector store and fast retrieval.\n",
      "\n",
      "3. **Tool use**:\n",
      "   - Calling external APIs for information not contained in the model weights, accessing current information, code execution capabilities, and proprietary information sources.\n"
     ]
    }
   ],
   "source": [
    "response = openai_client.chat.completions.create(\n",
    "    model=\"gpt-4o\",\n",
    "    messages=[\n",
    "        {\"role\": \"system\", \"content\": SYSTEM_PROMPT},\n",
    "        {\"role\": \"user\", \"content\": USER_PROMPT},\n",
    "    ],\n",
    ")\n",
    "print(response.choices[0].message.content)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "da31f19e0c8b5bb8",
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
